Calculus of Variations 



The biggest step from derivatives with one variable to derivatives with many variables is from 
one to two. After that, going from two to three was just more algebra and more complicated pictures. 
Now the step will be from a finite number of variables to an infinite number. That will require a new 
set of tools, yet in many ways the techniques are not very different from those you know. 

If you've never read chapter 19 of volume II of the Feynman Lectures on Physics, now would be a 
good time. It's a classic introduction to the area. For a deeper look at the subject, pick up MacCluer's 
book referred to in the Bibliography at the beginning of this book. 

16.1 Examples 

What line provides the shortest distance between two points? A straight line of course, no surprise 
there. But not so fast: with a few twists on the question the result won't be nearly as obvious. How 
do I measure the length of a curved (or even straight) line? Typically with a ruler. For the curved 
line I have to do successive approximations, breaking the curve into small pieces and adding the finite 
number of lengths, eventually taking a limit to express the answer as an integral. Even with a straight 
line I will do the same thing if my ruler isn't long enough. 

Put this in terms of how you do the measurement: Go to a local store and purchase a ruler. 
It's made out of some real material, say brass. The curve you're measuring has been laid out on the 
ground, and you move along it, counting the number of times that you use the ruler to go from one 
point on the curve to another. If the ruler measures in decimeters and you lay it down 100 times along 
the curve, you have your first estimate for the length, 10.0 meters. Do it again, but use a centimeter 
length and you need 1008 such lengths: 10.08 meters. 

That's tedious, but simple. Now do it again for another curve and compare their lengths. 
Here comes the twist: The ground is not at a uniform temperature. Perhaps you're making these 
measurements over a not-fully-cooled lava flow in Hawaii. Brass will expand when you heat it, so if the 
curve whose length you're measuring passes over a hot spot, then the ruler will expand when you place 
it down, and you will need to place it down fewer times to get to the end of the curve. You will measure 
the curve as shorter. Now it is not so clear which curve will have the shortest (measured) length. If 
you take the straight line and push it over so that it passes through a hotter region, then you may get 
a smaller result. 

Let the coefficient of expansion of the ruler be $\alpha$, assumed constant. For modest temperature 
changes, the length of the ruler is $\ell' = (1 + \alpha\,\Delta T)\,\ell$. The length of a curve as measured with this ruler is 

$$ \int \frac{d\ell}{1 + \alpha T} \tag{16.1} $$

Here I'm taking $T = 0$ as the base temperature for the ruler and $d\ell$ is the length you would use if 
everything stayed at this temperature. With this measure for length, it becomes an interesting problem 
to discover which path has the shortest "length." The formal term for the path of shortest length is 
geodesic. 

In section 13.1 you saw integrals that looked very much like this, though applied to a different 
problem. There I looked at the time it takes a particle to slide down a curve under gravity. That time is 
the integral of $dt = d\ell/v$, where $v$ is the particle's speed, a function of position along the path. Using 
conservation of energy, the expression for the time to slide down a curve was Eq. (13.6). 

$$ \int dt = \int \frac{d\ell}{\sqrt{(2/m)(E + mgy)}} \tag{16.2} $$

In that chapter I didn't attempt to answer the question about which curve provides the quickest route to 
the end, but in this chapter I will. Even qualitatively you can see a parallel between these two problems. 
You get a shorter length by pushing the curve into a region of higher temperature. You get a shorter 
time by pushing the curve lower (larger $y$). In the latter case, this means that you drop fast to pick 
up speed quickly. In both cases the denominator in the integral is larger. You can overdo it of course. 
Push the curve too far and the value of $\int d\ell$ itself can become too big. It's a balance. 

In problems 2.35 and 2.39 you looked at the amount of time it takes light to travel from one 
point to another along various paths. Is the time a minimum, a maximum, or neither? In these special 
cases, you saw that this is related to the focus of the lens or of the mirror. This is a very general 
property of optical systems, and is an extension of some of the ideas in the first two examples above. 

These questions are sometimes pretty and elegant, but are they related to anything else? Yes. 
Newton's classical mechanics can be reformulated in this language and it leads to powerful methods to 
set up the equations of motion in complicated problems. The same ideas lead to useful approximation 
techniques in electromagnetism, allowing you to obtain high-accuracy solutions to problems for which 
there is no solution by other means. 

16.2 Functional Derivatives 

It is time to get specific and to implement* these concepts. All the preceding examples can be expressed 
in the same general form. In a standard x-y rectangular coordinate system, 

$$ d\ell = \sqrt{dx^2 + dy^2} = dx\,\sqrt{1 + \left(\frac{dy}{dx}\right)^2} = dx\,\sqrt{1 + y'^2} $$

Then Eq. (16.1) is 

$$ \int_a^b dx\,\frac{\sqrt{1 + y'^2}}{1 + \alpha T(x, y)} \tag{16.3} $$

This measured length depends on the path, and I've assumed that I can express the path with $y$ as a 
function of $x$. No loops. You can allow more general paths by using another parametrization: $x(t)$ and 
$y(t)$. Then the same integral becomes 

$$ \int_{t_1}^{t_2} dt\,\frac{\sqrt{\dot x^2 + \dot y^2}}{1 + \alpha T\big(x(t), y(t)\big)} \tag{16.4} $$

The equation (16.2) has the same form 

$$ \int_a^b dx\,\frac{\sqrt{1 + y'^2}}{\sqrt{(2/m)(E + mgy)}} $$

And the travel time for light through an optical system is 

$$ \int dt = \int \frac{d\ell}{v} = \int_a^b dx\,\frac{\sqrt{1 + y'^2}}{v(x, y)} $$

where the speed of light is some known function of the position. 

* If you find the methods used in this section confusing, you may prefer to look at an alternate 
approach to the subject as described in section 16.6. Then return here. 

In all of these cases the output of the integral depends on the path taken. It is a functional of 
the path, a scalar-valued function of a function variable. Denote the argument by square brackets. 

$$ I[y] = \int_a^b dx\, F\big(x, y(x), y'(x)\big) \tag{16.5} $$

The specific $F$ varies from problem to problem, but the preceding examples all have this general form, 
even when expressed in the parametrized variables of Eq. (16.4). 

The idea of differential calculus is that you can get information about a function if you try changing 
the independent variable by a small amount. Do the same thing here. Now however the independent 
variable is the whole path, so I'll change that path by some small amount and see what happens to the 
value of the integral $I$. This approach to the subject is due to Lagrange. The development in section 
16.6 comes from Euler. 

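$$ \Delta I = I[y + \delta y] - I[y]
   = \int_{a+\Delta a}^{b+\Delta b} dx\, F\big(x, y(x) + \delta y(x), y'(x) + \delta y'(x)\big)
   - \int_a^b dx\, F\big(x, y(x), y'(x)\big) \tag{16.6} $$

[Figure: the path $y$ and the varied path $y + \delta y$, separated vertically by $\delta y$.]

The (small) function $\delta y(x)$ is the vertical displacement of the path in this coordinate system. 
To keep life simple for the first attack on this problem, I'll take the special case for which the endpoints 
of the path are fixed. That is, 

$$ \Delta a = 0, \qquad \Delta b = 0, \qquad \delta y(a) = 0, \qquad \delta y(b) = 0 $$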

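To compute the value of Eq. (16.6) use the power series expansion of $F$, as in section 2.5. 

$$ F(x + \Delta x, y + \Delta y, z + \Delta z) = F(x, y, z)
   + \frac{\partial F}{\partial x}\Delta x + \frac{\partial F}{\partial y}\Delta y + \frac{\partial F}{\partial z}\Delta z
   + \frac{\partial^2 F}{\partial x^2}\frac{(\Delta x)^2}{2} + \frac{\partial^2 F}{\partial x\,\partial y}\Delta x\,\Delta y + \cdots $$

For now look at just the lowest order terms, linear in the changes, so ignore the second order terms. In 
this application, there is no $\Delta x$. 

$$ F(x, y + \delta y, y' + \delta y') = F(x, y, y') + \frac{\partial F}{\partial y}\delta y + \frac{\partial F}{\partial y'}\delta y' $$

plus terms of higher order in $\delta y$ and $\delta y'$. 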
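Put this into Eq. (16.6), and 

$$ \delta I = \int_a^b dx \left[ \frac{\partial F}{\partial y}\delta y + \frac{\partial F}{\partial y'}\delta y' \right] \tag{16.7} $$

For example, let $F = x^2 + y^2 + y'^2$ on the interval $0 \le x \le 1$. Take a base path to be a straight 
line from (0, 0) to (1, 1). Choose for the change in the path $\delta y(x) = \epsilon x(1 - x)$. This is simple and it 
satisfies the boundary conditions. 

$$ I[y] = \int_0^1 dx\,\big[x^2 + y^2 + y'^2\big] = \int_0^1 dx\,\big[x^2 + x^2 + 1^2\big] = \frac{5}{3} $$

$$ I[y + \delta y] = \int_0^1 dx\,\Big[x^2 + \big(x + \epsilon x(1 - x)\big)^2 + \big(1 + \epsilon(1 - 2x)\big)^2\Big]
   = \frac{5}{3} + \frac{1}{6}\epsilon + \frac{11}{30}\epsilon^2 \tag{16.8} $$

The value of Eq. (16.7) is 

$$ \delta I = \int_0^1 dx\,\big[2y\,\delta y + 2y'\,\delta y'\big]
   = \int_0^1 dx\,\big[2x\,\epsilon x(1 - x) + 2\epsilon(1 - 2x)\big] = \frac{1}{6}\epsilon $$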
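As a quick numerical cross-check of Eq. (16.8) and of the first-order change from Eq. (16.7), here is a minimal Python sketch. The helper names `simpson` and `I_varied`, the use of Simpson's rule, and the sample value ε = 0.1 are illustrative assumptions, not anything prescribed by the text.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def I_varied(eps):
    """I[y + dy] for F = x^2 + y^2 + y'^2 with y = x and dy = eps * x * (1 - x)."""
    def integrand(x):
        y = x + eps * x * (1 - x)        # varied path
        yp = 1 + eps * (1 - 2 * x)       # its derivative
        return x**2 + y**2 + yp**2
    return simpson(integrand, 0.0, 1.0)

eps = 0.1                                # hypothetical size of the variation
closed_form = 5 / 3 + eps / 6 + 11 * eps**2 / 30   # right side of Eq. (16.8)
change = I_varied(eps) - I_varied(0.0)             # full change in I
print(I_varied(eps), closed_form)
print(change, eps / 6)                   # full change versus the linear term alone
```

The full change differs from the linear term ε/6 only by the second-order piece 11ε²/30, which is the part that Eq. (16.7) deliberately drops.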



Return to the general case of Eq. (16.7) and you will see that I've explicitly used only part of the 
assumption that the endpoint of the path hasn't moved, $\Delta a = \Delta b = 0$. There's nothing in the body 
of the integral itself that constrains the change in the y-direction, and I had to choose the function $\delta y$ 
by hand so that this constraint held. In order to use the equations $\delta y(a) = \delta y(b) = 0$ more generally, 
there is a standard trick: integrate by parts. You'll always integrate by parts in these calculations. 



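$$ \frac{\partial F}{\partial y'}\,\frac{d\,\delta y}{dx}
   = \frac{d}{dx}\left[\frac{\partial F}{\partial y'}\,\delta y\right]
   - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\delta y(x) $$

This expression allows you to use the information that the path hasn't moved at its endpoints in the y 
direction either. The boundary term from this partial integration is 

$$ \left.\frac{\partial F}{\partial y'}\,\delta y\,\right|_a^b
   = \left.\frac{\partial F}{\partial y'}\right|_{x=b}\delta y(b) - \left.\frac{\partial F}{\partial y'}\right|_{x=a}\delta y(a) = 0 $$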



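Put the last two equations back into the expression for $\delta I$, Eq. (16.7) and the result is 

$$ \delta I = \int_a^b dx \left[ \frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) \right] \delta y \tag{16.9} $$

Use this expression for the same example $F = x^2 + y^2 + y'^2$ with $y(x) = x$ and you have 

$$ \delta I = \int_0^1 dx \left[ 2y - \frac{d}{dx}\,2y' \right] \delta y
   = \int_0^1 dx\,\big[2x - 0\big]\,\epsilon x(1 - x) = \frac{1}{6}\epsilon $$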



This is sort of like Eq. (8.16), 

$$ df = \vec G \cdot d\vec r = \operatorname{grad} f \cdot d\vec r = \nabla f \cdot d\vec r
   = \sum_k \frac{\partial f}{\partial x_k}\,dx_k
   = \frac{\partial f}{\partial x_1}\,dx_1 + \frac{\partial f}{\partial x_2}\,dx_2 + \cdots $$

The differential change in the function depends linearly on the change $d\vec r$ in the coordinates. It is a sum 
over the terms with $dx_1, dx_2, \ldots$ This is a precise parallel to Eq. (16.9), except that the sum over 
discrete index $k$ is now an integral over the continuous index $x$. The change in $I$ is a linear functional of 
the change $\delta y$ in the independent variable $y$; this $\delta y$ corresponds to the change $d\vec r$ in the independent 
variable $\vec r$ in the other case. The coefficient of the change, instead of being called the gradient, is called 
the "functional derivative" though it's essentially the same thing. 

$$ \frac{\delta I}{\delta y} = \frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right),
   \qquad
   \delta I[y, \delta y] = \int_a^b dx\,\frac{\delta I}{\delta y}\big(x, y(x), y'(x)\big)\,\delta y(x) \tag{16.10} $$

and for a change, I've indicated explicitly the dependence of $\delta I$ on the two functions $y$ and $\delta y$. This 
parallels the equation (8.13). The statement that this functional derivative vanishes is called the 
Euler-Lagrange equation. 

Return to the example $F = x^2 + y^2 + y'^2$, then 

$$ \frac{\delta I}{\delta y} = \frac{\delta}{\delta y}\int_0^1 dx\,\big[x^2 + y^2 + y'^2\big]
   = 2y - \frac{d}{dx}\,2y' = 2y - 2y'' $$

What is the minimum value of $I$? Set this derivative to zero. 

$$ y'' - y = 0 \quad\Longrightarrow\quad y(x) = A\cosh x + B\sinh x $$

The boundary conditions $y(0) = 0$ and $y(1) = 1$ imply $y = B\sinh x$ where $B = 1/\sinh 1$. The value 
of $I$ at this point is 

$$ I[B\sinh x] = \int_0^1 dx\,\big[x^2 + B^2\sinh^2 x + B^2\cosh^2 x\big] = \frac{1}{3} + \coth 1 \tag{16.11} $$

Is it a minimum? Yes, but just as with the ordinary derivative, you have to look at the next order 
terms to determine that. Compare this value of $I[y] = 1.64637$ to the value 5/3 found for the nearby 
function $y(x) = x$, evaluated in Eq. (16.8). 
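The comparison between the stationary path and the straight line can be checked numerically. The following Python sketch is only an illustration: the helper `simpson` and the function name `I_of` are assumptions introduced here, and the quadrature rule is an arbitrary choice.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def I_of(y, yp):
    """I[y] = integral over 0..1 of x^2 + y^2 + y'^2, given y and its derivative yp."""
    return simpson(lambda x: x**2 + y(x)**2 + yp(x)**2, 0.0, 1.0)

B = 1 / math.sinh(1.0)

# Path that satisfies y'' - y = 0 with y(0) = 0, y(1) = 1
I_sinh = I_of(lambda x: B * math.sinh(x), lambda x: B * math.cosh(x))
# Nearby straight-line path y = x
I_line = I_of(lambda x: x, lambda x: 1.0)

closed_form = 1 / 3 + math.cosh(1.0) / math.sinh(1.0)   # 1/3 + coth 1, Eq. (16.11)
print(I_sinh, closed_form, I_line)
```

The two values differ by only about one percent, which is typical near a stationary point: a first-order error in the path produces only a second-order change in $I$.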

Return to one of the examples in the introduction: What is the shortest distance between two 
points? For now assume that there's no temperature variation. Write the length of a path for a 
function $y$ between fixed endpoints, take the derivative, and set that equal to zero. 

$$ L[y] = \int_a^b dx\,\sqrt{1 + y'^2}, \quad\text{so} $$

$$ \frac{\delta L}{\delta y} = -\frac{d}{dx}\,\frac{y'}{\sqrt{1 + y'^2}}
   = -\frac{y''}{\sqrt{1 + y'^2}} + \frac{y'^2\,y''}{(1 + y'^2)^{3/2}}
   = \frac{-y''}{(1 + y'^2)^{3/2}} = 0 $$

For a minimum length then, $y'' = 0$, and that's a straight line. Surprise! 

Do you really have to work through this mildly messy manipulation? Sometimes, but not here. 
Just notice that the derivative is in the form 

$$ \frac{\delta L}{\delta y} = -\frac{d}{dx}\,f(y') = 0 \tag{16.12} $$

so it doesn't matter what the particular $f$ is and you get a straight line. $f(y')$ is a constant so $y'$ must 
be constant too. Not so fast! See section 16.9 for another opinion. 

16.3 Brachistochrone 

Now for a tougher example, again from the introduction. In Eq. (16.2), which of all the paths between 
fixed initial and final points provides the path of least time for a particle sliding along it under gravity? 
Such a path is called a brachistochrone. This problem was first proposed by Bernoulli (one of them), and 
was solved by several people including Newton, though it's unlikely that he used the methods developed 
here, as the general structure we now call the calculus of variations was decades in the future. 
Assume that the particle starts from rest so that $E = 0$, then 

$$ T[y] = \int_0^{x_0} dx\,\frac{\sqrt{1 + y'^2}}{\sqrt{2gy}} \tag{16.13} $$

For the minimum time, compute the derivative and set it to zero. 

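$$ \frac{\delta T}{\delta y} = \frac{1}{\sqrt{2g}}\left[ -\frac{\sqrt{1 + y'^2}}{2y^{3/2}}
   - \frac{d}{dx}\,\frac{y'}{\sqrt{y}\,\sqrt{1 + y'^2}} \right] = 0 $$

This is starting to look intimidating, leading to an impossibly* messy differential equation. Is there 
another way? Yes. Why must $x$ be the independent variable? What about using $y$? In the general 
setup leading to Eq. (16.10) nothing changes except the symbols, and you have 

$$ I[x] = \int dy\,F(y, x, x') \quad\Longrightarrow\quad
   \frac{\delta I}{\delta x} = \frac{\partial F}{\partial x} - \frac{d}{dy}\left(\frac{\partial F}{\partial x'}\right) \tag{16.14} $$

Equation (16.13) becomes 

$$ T[x] = \int_0^{y_0} dy\,\frac{\sqrt{1 + x'^2}}{\sqrt{2gy}} \tag{16.15} $$

The function $x$ does not appear explicitly in this integral, just its derivative $x' = dx/dy$. This simplifies 
the functional derivative, and the minimum time now comes from the equation 

$$ \frac{\delta T}{\delta x} = 0 - \frac{d}{dy}\left(\frac{\partial F}{\partial x'}\right) = 0 \tag{16.16} $$

This is much easier. $d(\;)/dy = 0$ means that the object in parentheses is a constant. 

$$ \frac{\partial F}{\partial x'} = C = \frac{1}{\sqrt{2gy}}\,\frac{x'}{\sqrt{1 + x'^2}} $$

Solve this for $x'$ and you get (let $K = C\sqrt{2g}$) 

$$ x' = \frac{dx}{dy} = \sqrt{\frac{K^2 y}{1 - K^2 y}}, \quad\text{so}\quad
   x(y) = \int dy\,\sqrt{\frac{K^2 y}{1 - K^2 y}} $$

This is an elementary integral. Let $2a = 1/K^2$. 

$$ x(y) = \int dy\,\frac{y}{\sqrt{2ay - y^2}}
   = \int dy\,\frac{y - a + a}{\sqrt{a^2 - a^2 + 2ay - y^2}}
   = \int dy\,\frac{(y - a) + a}{\sqrt{a^2 - (y - a)^2}} $$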



* Only improbably. See problem 16.12. 



Make the substitution $(y - a)^2 = z$ in the first half of the integral and $(y - a) = a\sin\theta$ in the second 
half. 

$$ x(y) = \frac{1}{2}\int dz\,\frac{1}{\sqrt{a^2 - z}} + \int \frac{a^2\cos\theta\,d\theta}{\sqrt{a^2 - a^2\sin^2\theta}}
   = -\sqrt{a^2 - z} + a\theta
   = -\sqrt{a^2 - (y - a)^2} + a\sin^{-1}\left(\frac{y - a}{a}\right) + C' $$

The boundary condition that $x(0) = 0$ determines $C' = a\pi/2$, and the other end of the curve determines 
$a$: $x(y_0) = x_0$. You can rewrite this as 

$$ x(y) = -\sqrt{2ay - y^2} + a\cos^{-1}\left(\frac{a - y}{a}\right) \tag{16.17} $$

This is a cycloid. What's a cycloid and why does this equation describe one? See problem 16.2. 
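To make the result concrete, here is a hedged Python sketch that checks Eq. (16.17) against the usual parametric form of a cycloid, $x = a(\theta - \sin\theta)$, $y = a(1 - \cos\theta)$, and compares the sliding time along it with the time along a straight chute to the same endpoint. The function names, the values $a = 1$ and $g = 9.8$, the endpoint at $\theta = \pi$, and the midpoint-style quadrature are all assumptions chosen for illustration.

```python
import math

def x_of_y(y, a):
    """Eq. (16.17), valid on the descending half of the arch, 0 <= y <= 2a."""
    return -math.sqrt(2 * a * y - y * y) + a * math.acos((a - y) / a)

def cycloid(theta, a=1.0):
    """Standard parametric cycloid with y measured downward."""
    return a * (theta - math.sin(theta)), a * (1 - math.cos(theta))

def line(s, x_end=math.pi, y_end=2.0):
    """Straight line from (0, 0) to (x_end, y_end); parameter s^2 keeps dt/ds finite at the start."""
    return x_end * s * s, y_end * s * s

def travel_time(path, s0, s1, g=9.8, n=20000):
    """Sum of (chord length) / (speed at mid-parameter), with speed sqrt(2 g y)."""
    ds = (s1 - s0) / n
    total = 0.0
    for k in range(n):
        xa, ya = path(s0 + k * ds)
        xb, yb = path(s0 + (k + 1) * ds)
        _, ym = path(s0 + (k + 0.5) * ds)
        total += math.hypot(xb - xa, yb - ya) / math.sqrt(2 * g * ym)
    return total

t_cycloid = travel_time(cycloid, 0.0, math.pi)   # endpoint (pi a, 2a) with a = 1
t_line = travel_time(line, 0.0, 1.0)             # same endpoints, straight path
print(t_cycloid, t_line, t_line / t_cycloid)
```

For this particular endpoint the cycloid time should come out as $\pi\sqrt{a/g}$, since along the cycloid $dt = \sqrt{a/g}\,d\theta$, and the straight line is slower.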
x-independent 

In Eqs. (16.15) and (16.16) there was a special case for which the dependent variable was missing from 
F. That made the equation much simpler. What if the independent variable is missing? Does that 
provide a comparable simplification? Yes, but it's trickier to find. 



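$$ I[y] = \int dx\,F(x, y, y'), \qquad
   \frac{\delta I}{\delta y} = \frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0 \tag{16.18} $$

Use the chain rule to differentiate $F$ with respect to $x$. 

$$ \frac{dF}{dx} = \frac{\partial F}{\partial x} + \frac{dy}{dx}\frac{\partial F}{\partial y} + \frac{dy'}{dx}\frac{\partial F}{\partial y'} \tag{16.19} $$

Multiply the Lagrange equation (16.18) by $y'$ to get 

$$ y'\frac{\partial F}{\partial y} - y'\frac{d}{dx}\frac{\partial F}{\partial y'} = 0 $$

Now substitute the term $y'(\partial F/\partial y)$ from the preceding equation (16.19). 

$$ \frac{dF}{dx} - \frac{\partial F}{\partial x} - \frac{dy'}{dx}\frac{\partial F}{\partial y'} - y'\frac{d}{dx}\frac{\partial F}{\partial y'} = 0 $$

The last two terms are the derivative of a product. 

$$ \frac{dF}{dx} - \frac{\partial F}{\partial x} - \frac{d}{dx}\left[ y'\frac{\partial F}{\partial y'} \right] = 0 \tag{16.20} $$

If the function $F$ has no explicit $x$ in it, the second term is zero, and the equation is now a derivative 

$$ \frac{d}{dx}\left[ F - y'\frac{\partial F}{\partial y'} \right] = 0
   \qquad\text{and}\qquad
   y'\frac{\partial F}{\partial y'} - F = C \tag{16.21} $$

This is already down to a first order differential equation. The combination $y'F_{y'} - F$ that appears on 
the left of the second equation is important. It's called the Hamiltonian. 

16.4 Fermat's Principle 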

Fermat's principle of least time provides a formulation of geometrical optics. When you don't know 
about the wave nature of light, or if you ignore that aspect of the subject, it seems that light travels in 
straight lines — at least until it hits something. Of course this isn't fully correct, because when light 
hits a piece of glass it refracts and its path bends and follows Snell's equation. All of these properties 
of light can be described by Fermat's statement that the path light follows will be that path that takes 
the least* time. 

$$ T = \int dt = \int \frac{d\ell}{v} = \frac{1}{c}\int n\,d\ell \tag{16.22} $$

The total travel time is the integral of the distance $d\ell$ over the speed (itself a function of position). 
The index of refraction is $n = c/v$, where $c$ is the speed of light in vacuum, so I can rewrite the travel 
time in the above form using $n$. The integral $\int n\,d\ell$ is called the optical path. 

From this idea it is very easy to derive the rule for reflection at a surface: angle of incidence 
equals angle of reflection. It is equally easy to derive Snell's law. (See problem 16.5.) I'll look at an 
example that would be difficult to do by any means other than Fermat's principle. Do you remember 
what an asphalt road looks like on a very hot day? If you are driving a car on a black colored road 
you may see the road in the distance appear to be covered by what looks like water. It has a sort 
of mirror-like sheen that is always receding from you — the "hot road mirage". You can never catch 
up to it. This happens because the road is very hot and it heats the air next to it, causing a strong 
temperature gradient near the surface. The density of the air decreases with rising temperature because 
the pressure is constant. That in turn means that the index of refraction will decrease with the rising 
temperature near the surface. The index will then be an increasing function of the distance above 
ground level, n = f{y), and the travel time of light will depend on the path taken. 

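$$ \int n\,d\ell = \int f(y)\,d\ell = \int f(y)\sqrt{1 + x'^2}\,dy = \int f(y)\sqrt{1 + y'^2}\,dx \tag{16.23} $$

What is $f(y)$? I'll leave that for a moment and then after carrying the calculation through for a while 
I can pick an $f$ that is both plausible and easy to manipulate. 

[Figure: a light ray above the road; axes $x$ (along the road) and $y$ (height above it).]

Should $x$ be the independent variable, or $y$? Either should work, and I chose $y$ because it seemed 
likely to be easier. (See problem 16.6 however.) The integrand then does not contain the dependent 
variable $x$. 

$$ \int n\,d\ell = \int f(y)\sqrt{1 + x'^2}\,dy \quad\Longrightarrow\quad
   \frac{d}{dy}\frac{\partial}{\partial x'}\Big[ f(y)\sqrt{1 + x'^2} \Big] = 0 $$

Solve for $x'$ to get 

$$ f(y)^2 x'^2 = C^2(1 + x'^2) \quad\text{so}\quad x' = \frac{dx}{dy} = \frac{C}{\sqrt{f(y)^2 - C^2}} \tag{16.24} $$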



* Not always least. This just requires the first derivative to be zero; the second derivative is 
addressed in section 16.10. "Fermat's principle of stationary time" may be more accurate, but "Fermat's 
principle of least time" is entrenched in the literature. 

At this point pick a form for the index of refraction that will make the integral easy and will still plausibly 
represent reality. The index increases gradually above the road surface, and the simplest function works: 
$f(y) = n_0(1 + \alpha y)$. The index increases linearly above the surface. 

$$ x(y) = \int \frac{C\,dy}{\sqrt{n_0^2(1 + \alpha y)^2 - C^2}}
   = \frac{C}{\alpha n_0}\int dy\,\frac{1}{\sqrt{(y + 1/\alpha)^2 - C^2/\alpha^2 n_0^2}} $$

This is an elementary integral. Let $u = y + 1/\alpha$, then $u = (C/\alpha n_0)\cosh\theta$. 

$$ x = \frac{C}{\alpha n_0}\int d\theta \quad\Longrightarrow\quad
   \theta = \frac{\alpha n_0}{C}(x - x_0) \quad\Longrightarrow\quad
   y = -\frac{1}{\alpha} + \frac{C}{\alpha n_0}\cosh\big((\alpha n_0/C)(x - x_0)\big) $$

$C$ and $x_0$ are arbitrary constants, and $x_0$ is obviously the center of symmetry of the path. You can 
relate the other constant to the y-coordinate at that same point: $C = n_0(\alpha y_0 + 1)$. 

Because the value of $\alpha$ is small for any real roadway, look at the series expansion of this hyperbolic 
function to the lowest order in $\alpha$. 

$$ y \approx y_0 + \alpha(x - x_0)^2/2 \tag{16.25} $$

When you look down at the road you can be looking at an image of the sky. The light comes from 
the sky near the horizon down toward the road at an angle of only a degree or two. It then curves up 
so that it can enter your eye as you look along the road. The shimmering surface is a reflection of the 
distant sky or in this case an automobile — a mirage. 
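The ray shape found above can be checked numerically against the constant of the motion in Eq. (16.24) and against the parabolic approximation of Eq. (16.25). The sketch below is only illustrative: the values of `n0`, `alpha`, `x0`, and `y0` are hypothetical and not measured properties of any real road, and the function names are introduced here for convenience.

```python
import math

n0 = 1.0003          # hypothetical index of refraction at the road surface
alpha = 1.0e-5       # hypothetical fractional gradient of the index, per meter
x0, y0 = 0.0, 0.02   # hypothetical lowest point of the ray, in meters

C = n0 * (1 + alpha * y0)    # from C = n0 (alpha y0 + 1)
k = alpha * n0 / C

def ray(x):
    """Exact ray height: y = -1/alpha + (C / (alpha n0)) cosh(k (x - x0))."""
    return -1 / alpha + (C / (alpha * n0)) * math.cosh(k * (x - x0))

def slope(x):
    """dy/dx of the ray above."""
    return math.sinh(k * (x - x0))

def parabola(x):
    """Lowest-order approximation, Eq. (16.25)."""
    return y0 + alpha * (x - x0) ** 2 / 2

def invariant(x):
    """f(y) / sqrt(1 + y'^2); this equals f(y) x' / sqrt(1 + x'^2) with x' = 1/y'."""
    f = n0 * (1 + alpha * ray(x))
    return f / math.sqrt(1 + slope(x) ** 2)

for x in (0.0, 25.0, 50.0, 100.0):
    print(x, ray(x), parabola(x), invariant(x) - C)
```

With these made-up numbers the ray rises by only a few centimeters over 100 meters, and the parabola is indistinguishable from the hyperbolic cosine at that scale.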




16.5 Electric Fields 

The energy density in an electric field is $\epsilon_0 E^2/2$. For the static case, this electric field is the gradient 
of a potential, $\vec E = -\nabla\phi$. Its total energy in a volume is then 

$$ W = \frac{\epsilon_0}{2}\int dV\,(\nabla\phi)^2 \tag{16.26} $$

What is the minimum value of this energy? Zero of course, if $\phi$ is a constant. That question is too 
loosely stated to be much use, but keep with it for a while and it will be possible to turn it into 
something more precise and more useful. As with any other derivative taken to find a minimum, change 
the independent variable by a small amount. This time the variable is the function $\phi$, so really this 
quantity $W$ can more fully be written as a functional $W[\phi]$ to indicate its dependence on the potential 
function. 

$$ W[\phi + \delta\phi] - W[\phi] = \frac{\epsilon_0}{2}\int dV\,\big(\nabla(\phi + \delta\phi)\big)^2 - \frac{\epsilon_0}{2}\int dV\,(\nabla\phi)^2
   = \frac{\epsilon_0}{2}\int dV\,\big(2\nabla\phi\cdot\nabla\delta\phi + (\nabla\delta\phi)^2\big) $$



Donald Collins, Warren Wilson College 

Now pull out a vector identity from problem 9.36, 

$$ \nabla\cdot(f\vec g) = \nabla f\cdot\vec g + f\,\nabla\cdot\vec g $$

and apply it to the previous line with $f = \delta\phi$ and $\vec g = \nabla\phi$. 

$$ W[\phi + \delta\phi] - W[\phi] = \epsilon_0\int dV\,\big[\nabla\cdot(\delta\phi\,\nabla\phi) - \delta\phi\,\nabla^2\phi\big] + \frac{\epsilon_0}{2}\int dV\,(\nabla\delta\phi)^2 $$

The divergence term is set up to use Gauss's theorem; this is the vector version of integration by parts. 

$$ W[\phi + \delta\phi] - W[\phi] = \epsilon_0\oint d\vec A\cdot(\nabla\phi)\,\delta\phi - \epsilon_0\int dV\,\delta\phi\,\nabla^2\phi + \frac{\epsilon_0}{2}\int dV\,(\nabla\delta\phi)^2 \tag{16.27} $$

If the value of the potential $\phi$ is specified everywhere on the boundary, then I'm not allowed to change 
it in the process of finding the change in $W$. That means that $\delta\phi$ vanishes on the boundary. That 
makes the boundary term, the $d\vec A$ integral, vanish. Its integrand is zero everywhere on the surface of 
integration. 

In looking for a minimum energy I want to set the first derivative to zero, and that's the coefficient 
of the term linear in $\delta\phi$. 

$$ \frac{\delta W}{\delta\phi} = -\epsilon_0\nabla^2\phi = 0 $$

The function that produces the minimum value of the total energy (with these fixed boundary conditions) 
is the one that satisfies Laplace's equation. Is it really a minimum? Yes. In this instance it's very easy 
to see that. The extra term in the change of $W$ is $\frac{\epsilon_0}{2}\int dV\,(\nabla\delta\phi)^2$. That is positive no matter what $\delta\phi$ 
is. 

That the correct potential function is the one having the minimum energy allows for an efficient 
approximate method to solve electrostatic problems. I'll illustrate this using an example borrowed from 
the Feynman Lectures on Physics and that you can also solve exactly: What is the capacitance of a 
length of coaxial cable? (Neglect the edge effects at the ends of the cable of course.) Let the inner 
and outer radii be $a$ and $b$, and the length $L$. A charge density $\lambda$ is on the inner conductor (and 
therefore $-\lambda$ on the inside of the outer conductor). It creates a radial electric field of size $\lambda/2\pi\epsilon_0 r$. 
The potential difference between the conductors is 

$$ \Delta V = \int_a^b dr\,\frac{\lambda}{2\pi\epsilon_0 r} = \frac{\lambda\ln(b/a)}{2\pi\epsilon_0} \tag{16.28} $$

The charge on the inner conductor is $\lambda L$, so $C = Q/\Delta V = 2\pi\epsilon_0 L/\ln(b/a)$, where $\Delta V = V_b - V_a$. 

The total energy satisfies $W = C\,\Delta V^2/2$, so for the given potential difference, knowing the 
energy is the same as knowing the capacitance. 

This exact solution provides a laboratory to test the efficacy of a variational approximation for 
the same problem. The idea behind this method is that you assume a form for the solution $\phi(r)$. This 
assumed form must satisfy the boundary conditions, but it need not satisfy Laplace's equation. It should 
also have one or more free parameters, the more the better up to the limits of your patience. Now 
compute the total energy that this trial function implies and then use the free parameters to minimize 
it. This function with minimum energy is the best approximation to the correct potential among all 
those with your assumed trial function. 

Let the potential at $r = a$ be $V_a$ and at $r = b$ it is $V_b$. An example function that satisfies these 
conditions is 

$$ \phi(r) = V_a + (V_b - V_a)\,\frac{r - a}{b - a} \tag{16.29} $$

The electric field implied by this is $\vec E = -\nabla\phi = \hat r\,(V_a - V_b)/(b - a)$, a constant radial component. 
From (16.26), the energy is 

$$ W = \frac{\epsilon_0}{2}\int_a^b L\,2\pi r\,dr\left(\frac{d\phi}{dr}\right)^2
   = \frac{\epsilon_0}{2}\int_a^b L\,2\pi r\,dr\left(\frac{\Delta V}{b - a}\right)^2
   = \frac{1}{2}\,\pi L\epsilon_0\,\frac{b + a}{b - a}\,\Delta V^2 $$

Set this to $C\,\Delta V^2/2$ to get $C$ and you have 

$$ C_{\text{approx}} = \pi L\epsilon_0\,\frac{b + a}{b - a} $$

How does this compare to the exact answer, $2\pi\epsilon_0 L/\ln(b/a)$? Let $x = b/a$. 

$$ \frac{C_{\text{approx}}}{C} = \frac{1}{2}\,\frac{b + a}{b - a}\,\ln(b/a) = \frac{1}{2}\,\frac{x + 1}{x - 1}\,\ln x $$

    x:      1.1     1.2    1.5    2.0   3.0   10.0 
    ratio:  1.0007  1.003  1.014  1.04  1.10  1.41 

Assuming a constant magnitude electric field in the region between the two cylinders is clearly 
not correct, but this estimate of the capacitance gives a remarkably good result even when the ratio 
of the radii is two or three. This is true even though I didn't even put in a parameter with which to 
minimize the energy. How much better will the result be if I do? 

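Instead of a linear approximation for the potential, use a quadratic. 

$$ \phi(r) = V_a + \alpha(r - a) + \beta(r - a)^2, \quad\text{with}\quad \phi(b) = V_b $$

Solve for $\alpha$ in terms of $\beta$ and you have, after a little manipulation, 

$$ \phi(r) = V_a + \Delta V\,\frac{r - a}{b - a} + \beta(r - a)(r - b) \tag{16.30} $$

Compute the energy from this. 

$$ W = \frac{\epsilon_0}{2}\int_a^b L\,2\pi r\,dr\left[\frac{\Delta V}{b - a} + \beta(2r - a - b)\right]^2 $$

Rearrange this for easier manipulation by defining $2\beta = \gamma\,\Delta V/(b - a)$ and $c = (a + b)/2$ then 

$$ W = \frac{\epsilon_0}{2}\,2L\pi\left(\frac{\Delta V}{b - a}\right)^2\int_a^b \big((r - c) + c\big)\,dr\,\big[1 + \gamma(r - c)\big]^2
   = \frac{\epsilon_0}{2}\,2L\pi\left(\frac{\Delta V}{b - a}\right)^2\Big[c(b - a) + \gamma(b - a)^3/6 + c\gamma^2(b - a)^3/12\Big] $$

$\gamma$ is a free parameter in this calculation, replacing the original $\beta$. To minimize this energy, set the 
derivative $dW/d\gamma = 0$, resulting in the value $\gamma = -1/c$. At this value of $\gamma$ the energy is 

$$ W = \frac{\epsilon_0}{2}\,2L\pi\left(\frac{\Delta V}{b - a}\right)^2\left[\frac{b^2 - a^2}{2} - \frac{(b - a)^3}{6(b + a)}\right] \tag{16.31} $$

Except for the factor of $\Delta V^2/2$ this is the new estimate of the capacitance, and to see how good it is, 
again take the ratio of this estimate to the exact value and let $x = b/a$. 

$$ \frac{C'_{\text{approx}}}{C} = \ln x\;\frac{1}{2}\left[\frac{x + 1}{x - 1} - \frac{x - 1}{3(x + 1)}\right] \tag{16.32} $$

    x:      1.1         1.2       1.5      2.0     3.0     10.0 
    ratio:  1.00000046  1.000006  1.00015  1.0012  1.0071  1.093 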

For only a one parameter adjustment, this provides very high accuracy. This sort of technique is the 
basis for many similar procedures in this and other contexts, especially in quantum mechanics. 
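The two tables of ratios can be reproduced in a few lines. This Python sketch simply evaluates the closed forms for the linear and quadratic trial functions; the function names are assumptions introduced for illustration.

```python
import math

def ratio_linear(x):
    """C_approx / C for the linear trial potential, with x = b/a."""
    return 0.5 * (x + 1) / (x - 1) * math.log(x)

def ratio_quadratic(x):
    """C'_approx / C for the quadratic trial potential at gamma = -1/c, Eq. (16.32)."""
    return 0.5 * math.log(x) * ((x + 1) / (x - 1) - (x - 1) / (3 * (x + 1)))

for x in (1.1, 1.2, 1.5, 2.0, 3.0, 10.0):
    print(f"x = {x:5.1f}   linear: {ratio_linear(x):.6f}   quadratic: {ratio_quadratic(x):.8f}")
```

Both ratios stay above one, as they must: any trial function other than the exact potential gives a larger energy, and so an overestimate of the capacitance at fixed potential difference.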

16.6 Discrete Version 

There is another way to find the functional derivative, one that more closely parallels the ordinary partial 
derivative. It is due to Euler, and he found it first, before Lagrange's discovery of the treatment that I've 
spent all this time on. Euler's method is perhaps more intuitive than Lagrange's, but it is not as easy 
to extend it to more than one dimension and it doesn't easily lend itself to the powerful manipulative 
tools that Lagrange's method does. This alternate method starts by noting that an integral is the limit 
of a sum. Go back to the sum and perform the derivative on it, finally taking a limit to get an integral. 
This turns the problem into a finite-dimensional one with ordinary partial derivatives. You have to 
decide which form of numerical integration to use, and I'll pick the trapezoidal rule, Eq. (11.15), with a 
constant interval. Other choices work too, but I think this entails less fuss. You don't see this approach 
as often as Lagrange's because it is harder to manipulate the equations with Euler's method, and 
the notation can become quite cumbersome. The trapezoidal rule for an integral is just the following 
picture, and all that you have to handle with any care are the endpoints of the integral. 

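$$ x_k = a + k\Delta, \qquad 0 \le k \le N \qquad\text{where}\quad \frac{b - a}{N} = \Delta $$

$$ \int_a^b dx\,f(x) = \lim\left[\frac{1}{2}f(a) + \sum_{k=1}^{N-1} f(x_k) + \frac{1}{2}f(b)\right]\Delta $$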

The integral Eq. (16.5) involves $y'$, so in the same spirit, approximate this by the centered difference. 

$$ y'_k = y'(x_k) \approx \big(y(x_{k+1}) - y(x_{k-1})\big)/2\Delta $$

This evaluates the derivative at each of the coordinates $\{x_k\}$ instead of between them. 
The discrete form of (16.5) is now 

$$ I_{\text{discrete}} = \frac{\Delta}{2}F\big(a, y(a), y'(a)\big)
   + \sum_{k=1}^{N-1} F\big(x_k, y(x_k), (y(x_{k+1}) - y(x_{k-1}))/2\Delta\big)\,\Delta
   + \frac{\Delta}{2}F\big(b, y(b), y'(b)\big) $$

Not quite yet. What about $y'(a)$ and $y'(b)$? The endpoints $y(a)$ and $y(b)$ aren't changing, but that 
doesn't mean that the slope there is fixed. At these two points, I can't use the centered difference 
scheme for the derivative, so I'll have to use an asymmetric form to give 

$$ I_{\text{discrete}} = \frac{\Delta}{2}F\big(a, y(a), (y(x_1) - y(x_0))/\Delta\big)
   + \frac{\Delta}{2}F\big(b, y(b), (y(x_N) - y(x_{N-1}))/\Delta\big)
   + \sum_{k=1}^{N-1} F\big(x_k, y(x_k), (y(x_{k+1}) - y(x_{k-1}))/2\Delta\big)\,\Delta \tag{16.33} $$



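When you keep the endpoints fixed, this is a function of $N - 1$ variables, $\{y_k = y(x_k)\}$ for 
$1 \le k \le N - 1$, and to find the minimum or maximum you simply take the partial derivative with 
respect to each of them. It is not a function of any of the $\{x_k\}$ because those are defined and fixed by 
the partition $x_k = a + k\Delta$. The clumsy part is keeping track of the notation. When you differentiate 
with respect to a particular $y_\ell$, most of the terms in the sum (16.33) don't contain it. There are only 
three terms in the sum that contribute: $\ell$ and $\ell \pm 1$. In the figure $N = 5$, and the $\ell = 2$ coordinate 
($y_2$) is being changed. For all the indices $\ell$ except the two next to the endpoints (1 and $N - 1$), this is 

$$ \frac{\partial}{\partial y_\ell} I_{\text{discrete}} = \frac{\partial}{\partial y_\ell}\Big[
   F\big(x_{\ell-1}, y_{\ell-1}, (y_\ell - y_{\ell-2})/2\Delta\big)
   + F\big(x_\ell, y_\ell, (y_{\ell+1} - y_{\ell-1})/2\Delta\big)
   + F\big(x_{\ell+1}, y_{\ell+1}, (y_{\ell+2} - y_\ell)/2\Delta\big)\Big]\Delta $$

[Figure: the partition for $N = 5$, with the points labeled 1 through 5 and $y_2$ being varied.]

An alternate standard notation for partial derivatives will help to keep track of the manipulations: 

$D_1F$ is the derivative with respect to the first argument 

The above derivative is then 

$$ \Big[ D_2F\big(x_\ell, y_\ell, (y_{\ell+1} - y_{\ell-1})/2\Delta\big)
   + \frac{1}{2\Delta}\big[ D_3F\big(x_{\ell-1}, y_{\ell-1}, (y_\ell - y_{\ell-2})/2\Delta\big)
   - D_3F\big(x_{\ell+1}, y_{\ell+1}, (y_{\ell+2} - y_\ell)/2\Delta\big)\big]\Big]\Delta \tag{16.34} $$

There is no $D_1F$ because the $x_\ell$ is essentially an index. 

If you now take the limit $\Delta \to 0$, the third argument in each function returns to the derivative 
$y'$ evaluated at various $x_k$s: 

$$ \Big[ D_2F(x_\ell, y_\ell, y'_\ell) + \frac{1}{2\Delta}\big[ D_3F(x_{\ell-1}, y_{\ell-1}, y'_{\ell-1}) - D_3F(x_{\ell+1}, y_{\ell+1}, y'_{\ell+1})\big]\Big]\Delta $$

$$ = \Big[ D_2F\big(x_\ell, y(x_\ell), y'(x_\ell)\big)
   + \frac{1}{2\Delta}\big[ D_3F\big(x_{\ell-1}, y(x_{\ell-1}), y'(x_{\ell-1})\big)
   - D_3F\big(x_{\ell+1}, y(x_{\ell+1}), y'(x_{\ell+1})\big)\big]\Big]\Delta \tag{16.35} $$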



Now take the limit that $\Delta \to 0$, and the last part is precisely the definition of (minus) the derivative 
with respect to $x$. This then becomes 

$$ \frac{1}{\Delta}\frac{\partial}{\partial y_\ell} I_{\text{disc}} \to D_2F(x_\ell, y_\ell, y'_\ell) - \frac{d}{dx}D_3F(x_\ell, y_\ell, y'_\ell) \tag{16.36} $$

Translate this into the notation that I've been using and you have Eq. (16.10). Why did I divide by 
$\Delta$ in the final step? That's the equivalent of looking for the coefficient of both the $dx$ and the $\delta y$ in 
Eq. (16.10). It can be useful to retain the discrete approximation of Eq. (16.34) or (16.35) to the end 
of the calculation. This allows you to do numerical calculations in cases where the analytic equations 
are too hard to manipulate. 

Again, not quite yet. The two cases $\ell = 1$ and $\ell = N - 1$ have to be handled separately. You 
need to go back to Eq. (16.33) to see how they work out. The factors show up in different places, but 
the final answer is the same. See problem 16.15. 

It is curious that when formulating the problem this way, you don't seem to need a partial 
integration. The result came out automatically. Would that be true with some integration method 
other than the trapezoidal rule? See problem 16.16. 
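The discrete sum of Eq. (16.33) is easy to put on a computer, and doing so gives a direct check that its partial derivatives, divided by Δ, approach the functional derivative. The sketch below is one possible way to do it, under stated assumptions: the function names, the choice N = 50, and the use of a small centered difference in $y_\ell$ to get the partial derivative are illustrative choices, and the integrand is the earlier example $F = x^2 + y^2 + y'^2$, for which $\delta I/\delta y = 2y - 2y''$.

```python
import math

def I_discrete(y, a, b, F):
    """Eq. (16.33): trapezoidal sum, centered differences inside, one-sided at the ends."""
    N = len(y) - 1
    d = (b - a) / N
    total = 0.5 * d * F(a, y[0], (y[1] - y[0]) / d)
    total += 0.5 * d * F(b, y[N], (y[N] - y[N - 1]) / d)
    for k in range(1, N):
        total += d * F(a + k * d, y[k], (y[k + 1] - y[k - 1]) / (2 * d))
    return total

def derivative_over_delta(y, a, b, F, ell, h=1e-4):
    """(1/Delta) * dI_discrete/dy_ell, estimated by nudging y_ell up and down by h."""
    d = (b - a) / (len(y) - 1)
    up, dn = list(y), list(y)
    up[ell] += h
    dn[ell] -= h
    return (I_discrete(up, a, b, F) - I_discrete(dn, a, b, F)) / (2 * h * d)

def F(x, y, yp):
    return x**2 + y**2 + yp**2

N = 50
xs = [k / N for k in range(N + 1)]
line = xs[:]                                          # the path y = x
best = [math.sinh(x) / math.sinh(1.0) for x in xs]    # the path y = sinh x / sinh 1

ell = N // 2
on_line = derivative_over_delta(line, 0.0, 1.0, F, ell)   # compare with 2y - 2y'' = 2x
on_best = derivative_over_delta(best, 0.0, 1.0, F, ell)   # compare with zero
print(on_line, 2 * xs[ell], on_best)
```

On the straight line the discrete derivative reproduces $2x_\ell$, and on the hyperbolic sine it is zero up to a discretization error of order $\Delta^2$.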

16.7 Classical Mechanics 

The calculus of variations provides a way to reformulate Newton's equations of mechanics. The results 
produce efficient techniques to set up complex problems and they give insights into the symmetries of 
the systems. They also provide alternate directions in which to generalize the original equations. 

Start with one particle subject to a force, and the force has a potential energy function U. 
Following the traditions of the subject, denote the particle's kinetic energy by T. Picture this first in 
rectangular coordinates, where T = mv'^/2 and the potential energy is U {xi, X2, X3). The functional 
S depends on the path [xi{t), X2{t), X3{t)] from the initial to the final point. The integrand is the 
Lagrangian, L = T — U . 



S[f]= L{f,f)dt, where L = T - U = —{xj + xl + xl) - U{xi,X2,X3) (16.37) 



The statement that the functional derivative is zero is 

6S dL d f dL\ dU d 



dx]^ dt \dxk / dxk dt 

Set this to zero and you have 



mxk) 



QU dj^T -* 

mxh = — TT — , or m-r^ = F (16.38) 
oxj. dt^ 



That this integral of Ldt has a zero derivative is = ma. Now what? This may be elegant, but 
does it accomplish anything? The first observation is that when you state the problem in terms of this 
integral it is independent of the coordinate system. If you specify a path in space, giving the velocity at 
each point along the path, the kinetic energy and the potential energy are well-defined at each point on 
the path and the integral S is too. You can now pick whatever bizarre coordinate system that you want 
in order to do the computation of the functional derivative. Can't you do this with $\vec F = m\vec a$? Yes, but
computing an acceleration in an odd coordinate system is a lot more work than computing a velocity. 
A second advantage will be that it's easier to handle constraints by this method. The technique of 
Lagrange multipliers from section 8.12 will apply here too. 

Do the simplest example: plane polar coordinates. The kinetic energy is

$$T = \frac{m}{2}\left(\dot r^2 + r^2\dot\phi^2\right)$$

The potential energy is a function $U$ of $r$ and $\phi$. With the Lagrangian defined as $T - U$, the
variational derivative determines the equations of motion to be



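$$S[r, \phi] = \int_{t_1}^{t_2} L\,dt$$

$$\frac{\delta S}{\delta r} = \frac{\partial L}{\partial r} - \frac{d}{dt}\frac{\partial L}{\partial \dot r} = mr\dot\phi^2 - \frac{\partial U}{\partial r} - m\ddot r = 0$$

$$\frac{\delta S}{\delta \phi} = \frac{\partial L}{\partial \phi} - \frac{d}{dt}\frac{\partial L}{\partial \dot\phi} = -\frac{\partial U}{\partial \phi} - m\frac{d}{dt}\left(r^2\dot\phi\right) = 0$$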



These are the components of $\vec F = m\vec a$ in polar coordinates. If the potential energy is independent of
$\phi$, the second equation says that angular momentum is conserved: $mr^2\dot\phi$.
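A quick numerical check of that conservation statement can be sketched as follows. The central potential $U = kr^2/2$, the unit mass, the initial conditions, and the Runge-Kutta integrator are all assumptions chosen for illustration; none of them comes from the text.

```python
def accelerations(r, phi, r_dot, phi_dot, m=1.0, k=1.0):
    """Polar equations of motion for the assumed potential U = k r^2 / 2."""
    dU_dr = k * r      # U depends on r only
    dU_dphi = 0.0      # so the phi-derivative vanishes
    # m r'' = m r phi'^2 - dU/dr
    r_ddot = r * phi_dot**2 - dU_dr / m
    # d/dt (m r^2 phi') = -dU/dphi, expanded and solved for phi''
    phi_ddot = (-dU_dphi / m - 2.0 * r * r_dot * phi_dot) / r**2
    return r_ddot, phi_ddot


def derivative(state):
    r, phi, r_dot, phi_dot = state
    r_ddot, phi_ddot = accelerations(r, phi, r_dot, phi_dot)
    return (r_dot, phi_dot, r_ddot, phi_ddot)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, d, h):
        return tuple(si + h * di for si, di in zip(s, d))
    k1 = derivative(state)
    k2 = derivative(shifted(state, k1, dt / 2))
    k3 = derivative(shifted(state, k2, dt / 2))
    k4 = derivative(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def angular_momentum(state, m=1.0):
    r, _, _, phi_dot = state
    return m * r**2 * phi_dot


# Assumed initial data: (r, phi, r_dot, phi_dot)
state = (1.0, 0.0, 0.3, 0.8)
ell_start = angular_momentum(state)
for _ in range(2000):
    state = rk4_step(state, 0.001)
ell_end = angular_momentum(state)
print(ell_start, ell_end)
```

The two printed values should agree to within the integration error. Making `dU_dphi` nonzero in this sketch would break the agreement, which is the content of the second equation of motion.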

What do the discrete approximate equations (16.34) or (16.35) look like in this context? Look 
at the case of one-dimensional motion to understand an example. The Lagrangian is 



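$$L = \frac{m}{2}\dot x^2 - U(x)$$

Take the expression in Eq. (16.34) and set it to zero

$$-\frac{dU}{dx}(x_\ell) + \frac{1}{2\Delta}\left[\frac{m(x_\ell - x_{\ell-2})}{2\Delta} - \frac{m(x_{\ell+2} - x_\ell)}{2\Delta}\right] = 0$$

or

$$m\,\frac{x_{\ell+2} - 2x_\ell + x_{\ell-2}}{(2\Delta)^2} = -\frac{dU}{dx}(x_\ell) \qquad (16.39)$$

This is the discrete approximation to the second derivative, as in Eq. (11.12).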
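As an illustration of how Eq. (16.39) can be used numerically, here is a minimal sketch that solves it for $x_{\ell+2}$ and marches forward. It assumes a harmonic potential $U = m\omega^2x^2/2$ and treats the problem as an initial-value problem with two starting points, which is a choice made here for simplicity; the variational problem itself fixes the two endpoints instead. The function name and the parameter values are hypothetical.

```python
import math


def march_discrete_equation(x0, x1, delta, n_steps, force_over_mass):
    """March Eq. (16.39) forward in the form
    x[l+2] = 2 x[l] - x[l-2] + (2 delta)^2 * F(x[l]) / m.
    Successive list entries are separated in time by 2*delta."""
    h = 2.0 * delta
    xs = [x0, x1]
    for _ in range(n_steps):
        xs.append(2.0 * xs[-1] - xs[-2] + h * h * force_over_mass(xs[-1]))
    return xs


omega = 2.0          # assumed oscillator frequency
delta = 0.0005       # assumed grid spacing
h = 2.0 * delta
xs = march_discrete_equation(1.0, math.cos(omega * h), delta, 999,
                             lambda x: -omega**2 * x)

# Compare with the exact solution x(t) = cos(omega t) on the same grid
max_error = max(abs(x - math.cos(omega * i * h)) for i, x in enumerate(xs))
print(max_error)
```

The error shrinks as the square of the spacing, as expected for a centered second difference.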



16.8 Endpoint Variation 

Following Eq. (16.6) I restricted the variation of the path so that the endpoints are kept fixed. What 
if you don't? As before, keep terms to the first order, so that for example $\Delta t_b\,\delta y$ is out. Because the
most common application of this method involves integrals with respect to time, I'll use that integration 
variable 



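$$\begin{aligned}
\Delta S &= \int_{t_a+\Delta t_a}^{t_b+\Delta t_b} dt\, L\bigl(t, y(t)+\delta y(t), \dot y(t)+\delta\dot y(t)\bigr) - \int_{t_a}^{t_b} dt\, L\bigl(t, y(t), \dot y(t)\bigr) \\
&= \int_{t_a+\Delta t_a}^{t_b+\Delta t_b} dt \left[L(t, y, \dot y) + \frac{\partial L}{\partial y}\delta y + \frac{\partial L}{\partial \dot y}\delta\dot y\right] - \int_{t_a}^{t_b} dt\, L\bigl(t, y(t), \dot y(t)\bigr) \\
&= \left[\int_{t_b}^{t_b+\Delta t_b} - \int_{t_a}^{t_a+\Delta t_a}\right] dt\, L(t, y, \dot y) + \int_{t_a}^{t_b} dt \left[\frac{\partial L}{\partial y}\delta y + \frac{\partial L}{\partial \dot y}\delta\dot y\right]
\end{aligned}$$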



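Drop quadratic terms in the second line: anything involving $(\delta y)^2$ or $\delta y\,\delta\dot y$ or $(\delta\dot y)^2$. Similarly, drop
terms such as $\Delta t_a\,\delta y$ in going to the third line. Do a partial integration on the last term

$$\int_{t_a}^{t_b} dt\, \frac{\partial L}{\partial \dot y}\frac{d\,\delta y}{dt} = \frac{\partial L}{\partial \dot y}\,\delta y\,\bigg|_{t_a}^{t_b} - \int_{t_a}^{t_b} dt\, \frac{d}{dt}\left(\frac{\partial L}{\partial \dot y}\right)\delta y(t) \qquad (16.40)$$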



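The first two terms, with the $\Delta t_a$ and $\Delta t_b$, are to first order

$$\left[\int_{t_b}^{t_b+\Delta t_b} - \int_{t_a}^{t_a+\Delta t_a}\right] dt\, L(t, y, \dot y) = L\bigl(t_b, y(t_b), \dot y(t_b)\bigr)\Delta t_b - L\bigl(t_a, y(t_a), \dot y(t_a)\bigr)\Delta t_a$$

This produces an expression for $\Delta S$

$$\begin{aligned}
\Delta S = {}& L\bigl(t_b, y(t_b), \dot y(t_b)\bigr)\Delta t_b - L\bigl(t_a, y(t_a), \dot y(t_a)\bigr)\Delta t_a \\
&+ \frac{\partial L}{\partial \dot y}(t_b)\,\delta y(t_b) - \frac{\partial L}{\partial \dot y}(t_a)\,\delta y(t_a) + \int_{t_a}^{t_b} dt \left[\frac{\partial L}{\partial y} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot y}\right)\right]\delta y
\end{aligned} \qquad (16.41)$$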



Up to this point the manipulation was straightforward, though you had to keep all the algebra
in good order. Now there are some rearrangements needed that are not all that easy to anticipate, 
adding and subtracting some terms. 

Start by looking at the two terms involving $\Delta t_b$ and $\delta y(t_b)$: the first and third terms. The
change in the position of this endpoint is not simply $\delta y(t_b)$. Even if $\delta y$ is identically zero the endpoint
will change in both the $t$-direction and in the $y$-direction because of the slope of the curve ($\dot y$) and the
change in the value of $t$ at the endpoint ($\Delta t_b$).

The total movement of the endpoint at $t_b$ is horizontal by the amount $\Delta t_b$, and it is vertical by
the amount $(\delta y + \dot y\,\Delta t_b)$. To incorporate this, add and subtract this second term, $\dot y\,\Delta t$, in order to
produce this combination as a coefficient.





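$$\begin{aligned}
L(t_b)\Delta t_b + \frac{\partial L}{\partial \dot y}(t_b)\,\delta y(t_b) &= \left[L(t_b)\Delta t_b - \frac{\partial L}{\partial \dot y}(t_b)\,\dot y(t_b)\Delta t_b\right] + \left[\frac{\partial L}{\partial \dot y}(t_b)\,\dot y(t_b)\Delta t_b + \frac{\partial L}{\partial \dot y}(t_b)\,\delta y(t_b)\right] \\
&= \left[L - \frac{\partial L}{\partial \dot y}\dot y\right]\Delta t_b + \frac{\partial L}{\partial \dot y}\Bigl[\dot y\,\Delta t_b + \delta y\Bigr]
\end{aligned} \qquad (16.42)$$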



Do the same thing at $t_a$, keeping the appropriate signs. Then denote

$$p = \frac{\partial L}{\partial \dot y}, \qquad H = p\dot y - L, \qquad \Delta y = \delta y + \dot y\,\Delta t$$



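$H$ is the Hamiltonian and Eq. (16.41) becomes Noether's theorem

$$\Delta S = \Bigl[p\,\Delta y - H\,\Delta t\Bigr]_{t_a}^{t_b} + \int_{t_a}^{t_b} dt \left[\frac{\partial L}{\partial y} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot y}\right)\right]\delta y \qquad (16.43)$$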



If the equations of motion are satisfied, the argument of the last integral is zero. The change in S then 
comes only from the translation of the endpoint in either the time or space direction. If $\Delta t$ is zero, and
$\Delta y$ is the same at the two ends, you are translating the curve in space, vertically in the graph. Then

$$\Delta S = p\,\Delta y\,\Big|_{t_a}^{t_b} = \bigl[p(t_b) - p(t_a)\bigr]\Delta y = 0$$

If the physical phenomenon described by this equation is invariant under spatial translation, then
momentum is conserved.

If you do a translation in time instead of space and $S$ is invariant, then $\Delta t$ is the same at the
start and finish, and

$$\Delta S = \bigl[-H(t_b) + H(t_a)\bigr]\Delta t = 0$$

This is conservation of energy. Write out what H is for the case of Eq. (16.37). 
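The endpoint formula, Eq. (16.43), can be checked numerically in a case where the action along the true path is known in closed form. The sketch below assumes a free particle ($U = 0$), for which the straight-line path gives $S = m(x_b - x_a)^2/2(t_b - t_a)$; the mass, endpoints, and displacements are arbitrary illustrative numbers.

```python
def free_particle_action(xa, ta, xb, tb, m):
    """Action along the straight-line path of a free particle (U = 0 assumed)."""
    return 0.5 * m * (xb - xa)**2 / (tb - ta)


m = 2.0
xa, ta, xb, tb = 0.0, 0.0, 3.0, 2.0

# On the straight path the velocity is constant, so p and H are the
# same at both ends.
v = (xb - xa) / (tb - ta)
p = m * v                       # p = dL/d(x_dot)
H = p * v - 0.5 * m * v**2      # H = p x_dot - L

# Small, independent movements of the two endpoints
d_xa, d_ta, d_xb, d_tb = 1e-6, -2e-6, 3e-6, 1e-6

actual = (free_particle_action(xa + d_xa, ta + d_ta, xb + d_xb, tb + d_tb, m)
          - free_particle_action(xa, ta, xb, tb, m))

# [p dx - H dt] evaluated at t_b minus the same expression at t_a
predicted = (p * d_xb - H * d_tb) - (p * d_xa - H * d_ta)
print(actual, predicted)
```

The two numbers differ only by terms of second order in the displacements. Shifting both endpoints by the same `d_xa = d_xb` with no time shift gives zero change, which is the momentum-conservation case described above.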

If you write this theorem in three dimensions and require that the system is invariant under 
rotations, you get conservation of angular momentum. In more complicated systems, especially in field
theory, there are other symmetries, and they in turn lead to conservation laws. For example conservation 
of charge is associated with a symmetry called "gauge symmetry" in quantum mechanical systems. 

The equation (16.10), in which the variation $\delta y$ had the endpoints fixed, is much like a directional
derivative in multivariable calculus. For a directional derivative you find how a function changes as the 
independent variable moves along some specified direction, and in the variational case the direction was 
specified to be with functions that were tied down at the endpoints. The development of the present 
section is in the spirit of finding the derivative in all possible directions, not just a special set. 

16.9 Kinks 

In all the preceding analysis of minimizing solutions to variational problems, I assumed that everything 
is differentiable and that all the derivatives are continuous. That's not always so, and it is quite possible 
for a solution to one of these problems to have a discontinuous derivative somewhere in the middle. 
These are more complicated to handle, but just because of some extra algebra. An integral such as 
Eq. (16.5) is perfectly well defined if the integrand has a few discontinuities, but the partial integrations 
leading to the Euler-Lagrange equations are not. You can apply the Euler-Lagrange equations only in 
the intervals between any kinks. 

If you're dealing with a problem of the standard form $I[x] = \int_a^b dt\, L(t, x, \dot x)$ and you want to
determine whether there is a kink along the path, there are some internal boundary conditions that have 
to hold. Roughly speaking they are conservation of momentum and conservation of energy, Eq. (16.44), 
and you can show this using the results of the preceding section on endpoint variations. 



$$S = \int_{t_a}^{t_b} dt\, L(t, x, \dot x) = \int_{t_a}^{t_m} dt\, L + \int_{t_m}^{t_b} dt\, L$$



Assume there is a discontinuity in the derivative $\dot x$ at a point in the middle, $t_m$. The equation to solve is
still $\delta S/\delta x = 0$, and for variations of the path that leave the endpoints and the middle point alone you
have to satisfy the standard Euler-Lagrange differential equations on the two segments. Now however 
you also have to set the variation to zero for paths that leave the endpoints alone but move the middle 
point. 





[Figure: two panels, each with the times $t_a$, $t_m$, $t_b$ marked along the axis.]

Apply Eq. (16.43) to each of the two segments, and assume that the differential equations are 
already satisfied in the two halves. For the sort of variation described in the last two figures, look at 
the endpoint variations of the two segments. They produce 



$$\delta S = \Bigl[p\,\Delta x - H\,\Delta t\Bigr]_{t_a}^{t_m} + \Bigl[p\,\Delta x - H\,\Delta t\Bigr]_{t_m}^{t_b} = \bigl[p\,\Delta x - H\,\Delta t\bigr](t_m^-) - \bigl[p\,\Delta x - H\,\Delta t\bigr](t_m^+) = 0$$

These represent the contributions to the variation just above $t_m$ and just below it. This has to vanish
for arbitrary $\Delta t$ and $\Delta x$, so it says

$$p(t_m^-) = p(t_m^+) \quad\text{and}\quad H(t_m^-) = H(t_m^+) \qquad (16.44)$$

These equations, called the Weierstrass-Erdmann conditions, are two equations for the values of the 
derivative, $\dot x$, on the two sides of $t_m$. The two equations for the two unknowns may tell you that there
is no discontinuity in the derivative, or if there is then it will dictate the algebraic equations that the 
two values of x must satisfy. More dimensions means more equations of course. 

There is a class of problems in geometry coming under the general 
heading of Plateau's Problem. What is the minimal surface that spans a 
given curve? Here the functional is $\int dA$, giving the area as a function of
the function describing the surface. If the curve is a circle in the plane, then 
the minimum surface is the spanning disk. What if you twist the circle so 
that it does not quite lie in a plane? Now it's a tough problem. What if you 
have two parallel circles? Is the appropriate surface a cylinder? (No.) This 
subject is nothing more than the mathematical question of trying to describe 
soap bubbles. They're not all spheres. 

Do kinks happen often? They are rare in problems that usually come up in physics, and they seem
to be of more concern to those who apply these techniques in engineering. For an example that you
can verify for yourself however, construct a wire frame in the shape of a cube. You can bend wire or 
you can make it out of solder, which is much easier to manipulate. Attach a handle to one corner so 
that you can hold it. Now make a soap solution that you can use to blow bubbles. (A trick to make it 
work better is to add some glycerine.) Now dip the wire cube into the soap and see what sort of soap 
film will result, filling in the surfaces among the wires. It is not what you expect, and has several faces 
that meet at surprising angles. There is a square in the middle. This surface has minimum area among 
surfaces that span the cube. 

Example 

In Eq. (16.12), looking at the shortest distance between two points in a plane, I jumped to a conclusion. 
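To minimize the integral $\int_a^b f(y')\,dx$, use the Euler-Lagrange differential equation:

$$\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} = -\frac{d}{dx}f'(y') = -f''(y')\,y'' = 0$$

This seems to say that $f'(y')$ is a constant or that $y'' = 0$, implying either way that $y = Ax + B$, a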
straight line. Now that you know that solutions can have kinks, you have to look more closely. Take 
the particular example 

$$f(y') = \alpha y'^4 - \beta y'^2 \quad\text{with}\quad y(0) = 0, \text{ and } y(a) = b \qquad (16.45)$$

One solution corresponds to $y'' = 0$ and $y(x) = bx/a$. Can there be others?

Apply the conditions of Eq. (16.44) at some point between 0 and $a$. Call it $x_m$, and assume
that the derivative is not continuous. Call the derivatives on the left and right $(y'^-)$ and $(y'^+)$. The
first equation is

$$p = \frac{\partial L}{\partial y'} = \frac{\partial f}{\partial y'}, \quad\text{and}\quad p(x_m^-) = p(x_m^+)$$

$$4\alpha(y'^-)^3 - 2\beta(y'^-) = 4\alpha(y'^+)^3 - 2\beta(y'^+)$$

$$\bigl[(y'^-) - (y'^+)\bigr]\bigl[(y'^+)^2 + (y'^+)(y'^-) + (y'^-)^2 - \beta/2\alpha\bigr] = 0$$

If the slope is not continuous, the second factor must vanish.

$$(y'^+)^2 + (y'^+)(y'^-) + (y'^-)^2 - \beta/2\alpha = 0$$

This is one equation for the two unknown slopes. For the second equation, use the second condition,
the one on $H$.

$$H = y'\frac{\partial f}{\partial y'} - f, \quad\text{and}\quad H(x_m^-) = H(x_m^+)$$

$$H = y'\bigl[4\alpha y'^3 - 2\beta y'\bigr] - \bigl[\alpha y'^4 - \beta y'^2\bigr] = 3\alpha y'^4 - \beta y'^2$$

$$\bigl[(y'^-) - (y'^+)\bigr]\bigl[(y'^+)^3 + (y'^+)^2(y'^-) + (y'^+)(y'^-)^2 + (y'^-)^3 - \beta\bigl((y'^+) + (y'^-)\bigr)/3\alpha\bigr] = 0$$

Again, if the slope is not continuous this is

$$(y'^+)^3 + (y'^+)^2(y'^-) + (y'^+)(y'^-)^2 + (y'^-)^3 - \beta\bigl((y'^+) + (y'^-)\bigr)/3\alpha = 0$$

These are two equations in the two unknown slopes. It looks messy, but look back at $H$ itself first. It's
even. That means that its continuity will always hold if the slope simply changes sign.

$$(y'^+) = -(y'^-)$$

Can this work in the other (momentum) equation?

$$(y'^+)^2 + (y'^+)(y'^-) + (y'^-)^2 - \beta/2\alpha = 0 \quad\text{is now}\quad (y'^+)^2 = \beta/2\alpha$$

As long as $\alpha$ and $\beta$ have the same sign, this has the solution

$$(y'^+) = \pm\sqrt{\beta/2\alpha}, \qquad (y'^-) = \mp\sqrt{\beta/2\alpha} \qquad (16.46)$$
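A few lines of arithmetic confirm that the slopes of Eq. (16.46) satisfy both conditions of Eq. (16.44). This is a minimal sketch; the values of $\alpha$ and $\beta$ are arbitrary assumptions, and the function names are made up for the illustration.

```python
import math


def momentum(slope, alpha, beta):
    """p = df/dy' for f = alpha y'^4 - beta y'^2."""
    return 4.0 * alpha * slope**3 - 2.0 * beta * slope


def hamiltonian(slope, alpha, beta):
    """H = y' df/dy' - f = 3 alpha y'^4 - beta y'^2."""
    return 3.0 * alpha * slope**4 - beta * slope**2


alpha, beta = 2.0, 3.0                  # assumed, same sign
gamma = math.sqrt(beta / (2.0 * alpha))

# Slopes +gamma and -gamma on the two sides of the kink
p_jump = momentum(gamma, alpha, beta) - momentum(-gamma, alpha, beta)
H_jump = hamiltonian(gamma, alpha, beta) - hamiltonian(-gamma, alpha, beta)

# A sign flip at some other slope keeps H continuous (H is even)
# but not p, so it is not an allowed kink.
other = 1.0
p_jump_other = momentum(other, alpha, beta) - momentum(-other, alpha, beta)
H_jump_other = hamiltonian(other, alpha, beta) - hamiltonian(-other, alpha, beta)

print(p_jump, H_jump, p_jump_other, H_jump_other)
```

At the special slope the momentum is itself zero on both sides, which is why the sign flip costs nothing in the first condition.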

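The boundary conditions on the original problem were $y(0) = 0$ and $y(a) = b$. Denote $\gamma = \pm\sqrt{\beta/2\alpha}$,
and $x_1 = a/2 + b/2\gamma$, then

$$y = \begin{cases} \gamma x & (0 < x < x_1) \\ b - \gamma(x - a) & (x_1 < x < a) \end{cases} \qquad (16.47)$$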



The paths labeled 0, 1, and 2 are three solutions that make the variational functional derivative vanish. 
Which is smallest? Does that answer depend on the choice of the parameters? See problem 16.19. 

Are there any other solutions? After all, once you've found three, you
should wonder if it stops there. Yes, there are many, infinitely many in
this example. They are characterized by the same slopes, $\pm\gamma$, but they switch
back and forth several times before coming to the endpoint. The same internal
boundary conditions ($p$ and $H$) apply at each corner, and there's nothing in
their solutions saying that there is only one such kink.

Do you encounter such weird behavior often, with an infinite number of solutions? No, but you 
see from this example that it doesn't take a very complicated integrand to produce such a pathology. 

16.10 Second Order

Except for a couple of problems in optics in chapter two, 2.35 and 2.39, I've mostly ignored the question 
about minimum versus maximum. 

• Does it matter in classical mechanics whether the integral, $\int L\,dt$ is minimum or not in determining
the equations of motion? No. 

• In geometric optics, does it matter whether Fermat's principle of least time for the path of the light 
ray is really minimum? Yes, in this case it does, because it provides information about the focus. 

• In the calculation of minimum energy electric potentials in a capacitor does it matter? No, but only 
because it's always a minimum. 

• In problems in quantum mechanics similar to the electric potential problem, the fact that you're 
dealing sometimes with a minimum and sometimes not leads to some serious technical difficulties. 

How do you address this question? The same way you do in ordinary calculus: See what happens 
to the second order terms in your expansions. Take the same general form as before and keep terms 
through second order. Assume that the endpoints are fixed. 



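$$I[y] = \int_a^b dx\, F\bigl(x, y(x), y'(x)\bigr)$$

$$\begin{aligned}
\Delta I &= I[y + \delta y] - I[y] \\
&= \int_a^b dx\, F\bigl(x, y(x)+\delta y(x), y'(x)+\delta y'(x)\bigr) - \int_a^b dx\, F\bigl(x, y(x), y'(x)\bigr) \\
&= \int_a^b dx \left[\frac{\partial F}{\partial y}\delta y + \frac{\partial F}{\partial y'}\delta y' + \frac{1}{2}\frac{\partial^2 F}{\partial y^2}(\delta y)^2 + \frac{\partial^2 F}{\partial y\,\partial y'}\delta y\,\delta y' + \frac{1}{2}\frac{\partial^2 F}{\partial y'^2}(\delta y')^2\right]
\end{aligned} \qquad (16.48)$$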



If the first two terms combine to zero, this says the first derivative is zero. Now for the next terms. 

Recall the similar question that arose in section 8.11. How can you tell if a function of two 
variables has a minimum, a maximum, or neither? The answer required looking at the matrix of all the 
second derivatives of the function — the Hessian. Now, instead of a 2 x 2 matrix as in Eq. (8.31) you 
have an integral. 




In the two dimensional case $\nabla f = 0$ defines a minimum if the product $\langle d\vec r, H\,d\vec r\,\rangle$ is positive for all
possible directions $d\vec r$. For this new case the "directions" are the possible functions $\delta y$ and its derivative
$\delta y'$.

The direction to look first is where $\delta y'$ is big. The reason is that I can have a very small $\delta y$ that
has a very big $\delta y'$: $10^{-3}\sin(10^6 x)$. If $\Delta I$ is to be positive in every direction, it has to be positive in
this one. That requires $F_{y'y'} > 0$.

Is it really that simple? No. First the $\delta y$ terms can be important too, and second $y$ can itself have
several components. Look at the latter first. The final term in Eq. (16.48) should be

$$\int_a^b dx\, \frac{1}{2}\sum_{m,n}\frac{\partial^2 F}{\partial y'_m\,\partial y'_n}\,\delta y'_m\,\delta y'_n$$

This set of partial derivatives of $F$ is at each point along the curve a Hessian. At each point it has a set
of eigenvalues and eigenvectors, and if all along the path all the eigenvalues are always positive, it meets
the first, necessary conditions for the original functional to be a minimum. If you look at an example
from mechanics, for which the independent variable is time, these terms are then $\partial^2 L/\partial\dot x_m\,\partial\dot x_n$ instead. Terms
such as these typically represent kinetic energy and you expect that to be positive.
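An example:

$$S[\vec r\,] = \int_0^T dt\, L(x, y, \dot x, \dot y, t) = \int_0^T dt\, \frac{1}{2}\left[\dot x^2 + \dot y^2 + 2\gamma t\,\dot x\dot y - x^2 - y^2\right]$$

This is the action for a particle moving in two dimensions $(x, y)$ with the specified Lagrangian. The
equations of motion are

$$\frac{\delta S}{\delta x} = -x - \ddot x - \gamma(t\ddot y + \dot y) = 0$$

$$\frac{\delta S}{\delta y} = -y - \ddot y - \gamma(t\ddot x + \dot x) = 0$$

If $\gamma = 0$ you have two independent harmonic oscillators.

The matrix of derivatives of $L$ with respect to $\dot x = \dot y_1$ and $\dot y = \dot y_2$ is

$$\frac{\partial^2 L}{\partial \dot y_m\,\partial \dot y_n} = \begin{pmatrix} 1 & \gamma t \\ \gamma t & 1 \end{pmatrix}$$

The eigenvalues of this matrix are $1 \pm \gamma t$, with corresponding eigenvectors $\binom{1}{1}$ and $\binom{1}{-1}$. This
Hessian then says that $S$ should be a minimum up to the time $t = 1/\gamma$, but not after that. This is also
a singular point of the differential equations for $x$ and $y$.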
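The eigenvalue statement is easy to confirm numerically. The sketch below uses the closed-form eigenvalues of a symmetric 2 x 2 matrix; the value $\gamma = 0.5$ and the sample times are assumptions picked so that the sign change falls at $t = 1/\gamma = 2$.

```python
import math


def symmetric_2x2_eigenvalues(a, b, d):
    """Eigenvalues of [[a, b], [b, d]], smaller one first."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius


def kinetic_hessian_eigenvalues(t, gamma):
    """Eigenvalues of the matrix [[1, gamma t], [gamma t, 1]] from the example."""
    return symmetric_2x2_eigenvalues(1.0, gamma * t, 1.0)


gamma = 0.5
early = kinetic_hessian_eigenvalues(1.0, gamma)      # t < 1/gamma
critical = kinetic_hessian_eigenvalues(2.0, gamma)   # t = 1/gamma
late = kinetic_hessian_eigenvalues(3.0, gamma)       # t > 1/gamma
print(early, critical, late)
```

Both eigenvalues are positive at the early time, the smaller one reaches zero at $t = 1/\gamma$, and it is negative afterward.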

Focus 

When the Hessian made from the $(\delta y')^2$ terms has only positive eigenvalues everywhere, the preceding
analysis might lead you to believe that the functional is always a minimum. Not so. That condition is
necessary; it is not sufficient. It says that the functional is a minimum with respect to rapidly oscillating
$\delta y$. It does not say what happens if $\delta y$ changes gradually over the course of the integral. If this
happens, and if the length of the interval of integration is long enough, the $\delta y'$ terms may be the small
ones and the $(\delta y)^2$ may then dominate over the whole length of the integral. This is exactly what
happens in the problems 2.35, 2.39, and 16.17.

When this happens in an optical system, where the functional $T = \int d\ell/v$ is the travel time
along the path, it signals something important. You have a focus. An ordinary light ray obeys Fermat's 
principle that T is stationary with respect to small changes in the path. It is a minimum if the path is 
short enough. A focus occurs when light can go from one point to another by many different paths, 
and for still longer paths the path is neither minimum nor maximum. 

In the integral for $T$, where the starting point and the ending point are the source and image points,
the second order variation will be zero for these long, gradual changes in the path. The straight-line 
path through the center of the lens takes least time if its starting point and ending point are closer 
than this source and image. The same path will be a saddle (neither maximum nor minimum) if the 
points are farther apart than this. This sort of observation led to the development of the mathematical 
subject called "Morse Theory," a topic that has had applications in studying such diverse subjects as 
nuclear fission and the gravitational lensing of light from quasars. 

Thin Lens 

This provides a simple way to understand the basic equation for a thin lens. Let its thickness be t and 
its radius r. 




Light that passes through this lens along the straight line through the center moves more slowly as it 
passes through the thickness of the lens, and takes a time 



$$T_1 = \frac{1}{c}(p + q - t) + \frac{1}{c}\,nt$$



Light that takes a longer path through the edge of the lens encounters no glass along the way, and it
takes a time 

$$T_2 = \frac{1}{c}\left[\sqrt{p^2 + r^2} + \sqrt{q^2 + r^2}\right]$$



If p and q represent the positions of a source and the position of its image at a focus, these two times 
should be equal. At least they should be equal in the approximation that the lens is thin and when you 
keep terms only to the second order in the variation of the path. 



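$$T_2 = \frac{1}{c}\left[p\sqrt{1 + r^2/p^2} + q\sqrt{1 + r^2/q^2}\right] = \frac{1}{c}\left[p\left(1 + r^2/2p^2\right) + q\left(1 + r^2/2q^2\right)\right]$$

Equate $T_1$ and $T_2$.

$$\begin{aligned}
(p + q - t) + nt &= p\left(1 + r^2/2p^2\right) + q\left(1 + r^2/2q^2\right) \\
(n - 1)t &= \frac{r^2}{2p} + \frac{r^2}{2q} \\
\frac{1}{p} + \frac{1}{q} &= \frac{2(n-1)t}{r^2} = \frac{1}{f}
\end{aligned} \qquad (16.49)$$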



This is the standard equation describing the focusing properties of thin lenses as described in every 
elementary physics text that even mentions lenses. The focal length of the lens is then $f = r^2/2(n-1)t$.
That is not the expression you usually see, but it is the same. See problem 16.21. Notice that this
equation for the focus applies whether the lens is double convex or plano-convex or meniscus. If you
allow the thickness t to be negative (equivalent to saying that there's an extra time delay at the edges 
instead of in the center), then this result still works for a diverging lens, though the analysis leading up 
to it requires more thought. 
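To see how good the second-order approximation behind Eq. (16.49) is, you can solve $T_1 = T_2$ for $q$ without expanding the square roots and compare with the thin-lens result. The sketch below does that; the lens dimensions and index are assumed sample values (in meters), and the function names are invented for the illustration.

```python
import math


def image_distance_equal_times(p, r, n, t):
    """Solve T1 = T2 for q with the square roots kept exact, where
    c*T1 = (p + q - t) + n*t and c*T2 = sqrt(p^2 + r^2) + sqrt(q^2 + r^2)."""
    # T1 = T2 rearranges to sqrt(q^2 + r^2) - q = d, with
    d = (n - 1.0) * t - (math.hypot(p, r) - p)
    # Squaring q + d = sqrt(q^2 + r^2) gives r^2 = 2 q d + d^2.
    return (r * r - d * d) / (2.0 * d)


def image_distance_thin_lens(p, r, n, t):
    """Image distance from Eq. (16.49): 1/p + 1/q = 2 (n - 1) t / r^2."""
    f = r * r / (2.0 * (n - 1.0) * t)
    return 1.0 / (1.0 / f - 1.0 / p)


# Assumed sample lens: radius 2 cm, thickness 2 mm, index 1.5, source at 60 cm
p, r, n, t = 0.6, 0.02, 1.5, 0.002
q_exact = image_distance_equal_times(p, r, n, t)
q_thin = image_distance_thin_lens(p, r, n, t)
relative_difference = abs(q_exact - q_thin) / q_thin
print(q_exact, q_thin, relative_difference)
```

For these numbers the focal length is 0.2 m, and the two image distances differ by a small fraction of a percent, consistent with dropping terms beyond second order in $r$.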

Exercises

1 For the functional F[x] = x{0) + dt {x{t)'^ + x{tf) and the function x{t) = evaluate 
F[x]. 

2 For the functional $F[x] = \int_0^1 dt\, x(t)^2$ with the boundary conditions $x(0) = 0$ and $x(1) = 1$, what
is the minimum value of $F$ and what function $x$ gives it? Start by drawing graphs of various $x$ that
satisfy these boundary conditions. Is there any reason to require that $x$ be a continuous function of $t$?

3 With the functional $F$ of the preceding exercise, what is the functional derivative $\delta F/\delta x$?

Problems



16.1 You are near the edge of a lake and see someone in the water needing help. What path do you 
take to get there in the shortest time? You can run at a speed $v_1$ on the shore and swim at a probably
slower speed $v_2$ in the water. Assume that the edge of the water forms a straight line, and express your
result in a way that's easy to interpret, not as the solution to some quartic equation. Ans: Snell's Law.

16.2 The cycloid is the locus of a point that is on the edge of a circle that is itself rolling along a 
straight line — a pebble caught in the tread of a tire. Use the angle of rotation as a parameter and 
find the parametric equations for $x(\theta)$ and $y(\theta)$ describing this curve. Show that it is Eq. (16.17).

16.3 In Eq. (16.17), describing the shortest-time slide of a particle, what is the behavior of the function 
for $y \ll a$? In figuring out the series expansion of $w = \cos^{-1}(1 - t)$, you may find it useful to take the
cosine of both sides. Then you should be able to find that the two lowest order terms in this expansion
are $w = \sqrt{2t} + \sqrt{2}\,t^{3/2}/12$. You will need both terms. Ans: $x = \sqrt{2y^3/a}\,/3$

16.4 The dimensions of an ordinary derivative such as dx/dt is the quotient of the dimensions of 
the numerator and the denominator (here L/T). Does the same statement apply to the functional 
derivative? 



16.5 Use Fermat's principle to derive both Snell's law and the law of reflection at 
a plane surface. Assume two straight line segments from the starting point to the
ending point and minimize the total travel time of the light. The drawing applies 
to Snell's law, and you can compute the travel time of the light as a function of 
the coordinate x at which the light hits the surface and enters the higher index 
medium. 




16.6 Analyze the path of light over a roadway starting from Eq. (16.23) but using x as the independent 
variable instead of y. 

16.7 (a) Fill in the steps leading to Eq. (16.31). And do you understand the point of the rearrangements 
that I did just preceding it? Also, can you explain why the form of the function Eq. (16.30) should have 
been obvious without solving any extra boundary conditions? (b) When you can explain that in a few 
words, then what general cubic polynomial can you use to get a still better result? 

16.8 For the function $F(x, y, y') = x^2 + y^2 + y'^2$, explicitly carry through the manipulations leading
to Eq. (16.41). 

16.9 Use the explicit variation in Eq. (16.8) and find the minimum of that function of $\epsilon$. Compare that
minimum to the value found in Eq. (16.11). Ans: 1.64773 

16.10 Do either of the functions, Eqs. (16.29) or (16.30), satisfy Laplace's equation? 

16.11 For the function $F(x, y, y') = x^2 + y^2 + y'^2$, repeat the calculation of $\delta I$ only now keep all the
higher order terms in $\delta y$ and $\delta y'$. Show that the solution Eq. (16.11) is a minimum.

16.12 Use the techniques leading to Eq. (16.21) in order to solve the brachistochrone problem 
Eq. (16.13) again. This time use x as the independent variable instead of y. 

16.13 On a right circular cylinder, find the path that represents the shortest distance between two
points, $d\ell^2 = dz^2 + R^2 d\phi^2$.

16.14 Put two circular loops of wire in a soap solution and draw them out, keeping their 
planes parallel. If they are fairly close you will have a soap film that goes from one ring 
to the other, and the minimum energy solution is the one with the smallest area. What is 
the shape of this surface? Use cylindrical coordinates to describe the surface. It is called 
a catenoid, and its equation involves a hyperbolic cosine. 

16.15 There is one part of the derivation going from Eq. (16.33) to (16.36) that I omitted: 
the special cases of $\ell = 1$ and $\ell = N - 1$. Go back and finish that detail, showing that
you get the same result even in this case. 

16.16 Section 16.6 used the trapezoidal rule for numerical integration and the two-point centered 
difference for differentiation. What happens to the derivation if (a) you use Simpson's rule for integration 
or if (b) you use an asymmetric differentiation formula such as $y'(0) \approx [y(h) - y(0)]/h$?

16.17 For the simple harmonic oscillator, $L = m\dot x^2/2 - m\omega^2x^2/2$. Use the time interval $0 < t < T$ so
that $S = \int_0^T L\,dt$, and find the equation of motion from $\delta S/\delta x = 0$. When the independent variable $x$
is changed to $x + \delta x$, keep the second order terms in computing $\delta S$ this time and also make an explicit
choice of

$$\delta x(t) = \epsilon\sin(n\pi t/T)$$

For integer $n = 1, 2, \ldots$ this satisfies the boundary conditions that $\delta x(0) = \delta x(T) = 0$. Evaluate the
change in $S$ through second order in $\epsilon$ (that is, do it exactly). Find the conditions on the interval $T$ so
that the solution to $\delta S/\delta x = 0$ is in fact a minimum. Then find the conditions when it isn't, and what
is special about the $T$ for which $S[x]$ changes its structure? Note: This $T$ is defined independently
from $\omega$. It specifies an arbitrary time interval for the integral.

16.18 Eq. (16.37) describes a particle with a specified potential energy. For a charge in an electromagnetic
field let $U = qV(x_1, x_2, x_3, t)$ where $V$ is the electric potential. Now how do you include
magnetic effects? Add another term to $L$ of the form $C\,\dot{\vec r}\cdot\vec A(x_1, x_2, x_3, t)$. Figure out what the
Lagrange equations are, making $\delta S/\delta x_k = 0$. What value must $C$ have in order that this matches
$\vec F = q(\vec E + \vec v\times\vec B) = m\vec a$ with $\vec B = \nabla\times\vec A$? What is $\vec E$ in terms of $V$ and $\vec A$? Don't forget the chain
rule. Ans: $C = q$ and then $\vec E = -\nabla V - \partial\vec A/\partial t$

16.19 (a) For the solutions that spring from Eq. (16.46), which of the three results shown have the 
largest and smallest values of $\int f\,dx$? Draw a graph of $f(y')$ and see where the characteristic slope of
Eq. (16.46) is with respect to the graph. 

(b) There are circumstances for which these kinked solutions, Eq. (16.47) do and do not occur; find 
them and explain them. 

16.20 What are the Euler-Lagrange equations for $I[y] = \int_a^b dx\, F(x, y, y', y'')$?

16.21 The equation for the focal length of a thin lens, Eq. (16.49), is not the traditional one found in 
most texts. That is usually expressed in terms of the radii of curvature of the lens surfaces. Show that 
this is the same. Also note that Eq. (16.49) is independent of the optics sign conventions for curvature. 

16.22 The equation (16.25) is an approximate solution to the path for light above a hot road. Is there
a function n = f{y) representing the index of refraction above the road surface such that this equation 
would be its exact solution? 

16.23 On the first page of this chapter, you see the temperature dependence of length measurements. 

(a) Take a metal disk of radius a and place it centered on a block of ice. Assume that the metal 
reaches an equilibrium temperature distribution $T(r) = T_0(r^2/a^2 - 1)$. The temperature at the edge
is T = 0, and the ruler is calibrated there. The disk itself remains flat. Measure the distance from the 
origin straight out to the radial coordinate r. Call this measured radius s. Measure the circumference 
of the circle at this radius and then express this circumference in terms of the measured radius s. 

(b) On a sphere of radius $R$ (constant temperature) start at the pole ($\theta = 0$) and write the distance
along the arc at constant $\phi$ down to the angle $\theta$. Now go around the circle at this constant $\theta$ and write
its circumference. Express this circumference in terms of the distance you just wrote for the "radius" 
of this circle. 

(c) Show that the geometry you found in (a) is the same as that in (b) and find the radius of the 
sphere that this "heat metric" expresses. Ans: $R = a/2\sqrt{\alpha T_0(1 - \alpha T_0)} \approx a/2\sqrt{\alpha T_0}$

16.24 Using the same techniques as in section 16.5, apply these methods to two concentric spheres. 
Again, use a linear and then a quadratic approximation. Before you do this, go back to Eq. (16.30) 
and see if you can arrive at that form directly, without going through all the manipulations of solving 
for $\alpha$ and $\beta$. That is, determine how you could have gotten to (16.30) easily. Check some numbers
against the exact answer. 

16.25 For the variational problem Eq. (16.45) one solution is $y = bx/a$. Assume that $\alpha, \beta > 0$ and
determine if this is a minimum or maximum or neither. Do this also for the other solution, Eq. (16.47).
Ans: The first is min if $b/a > \sqrt{\beta/6\alpha}$. The kinked solution is always a minimum.

16.26 If you can construct glass with a variable index of refraction, you can make
a flat lens with an index that varies with distance from the axis. What function
of distance must the index $n(r)$ be in order that this flat cylinder of glass of
thickness $t$ has a focal length $f$? All small angles and thin lenses of course.
Ans: $n(r) = n(0) - r^2/2ft$.



